Abstract

We consider the lower-bounded facility location (LBFL) problem (also sometimes called load-balanced
facility location), which is a generalization of uncapacitated facility location (UFL), where each open
facility is required to serve a certain minimum amount of demand. More formally, an instance $\mathcal{I}$ of LBFL
is specified by a set $\mathcal{F}$ of facilities with facility-opening costs $\{f_i\}$, a set $\mathcal{D}$ of clients, and connection
costs $\{c_{ij}\}$ specifying the cost of assigning a client $j$ to a facility $i$, where the $c_{ij}$s form a metric. A
feasible solution specifies a subset $F \subseteq \mathcal{F}$ of facilities to open, and assigns each client $j$ to an open facility
$i(j) \in F$ so that each open facility serves at least $M$ clients, where $M$ is an input parameter. The cost
of such a solution is $\sum_{i \in F} f_i + \sum_j c_{i(j)j}$, and the goal is to find a feasible solution of minimum cost.

The current best approximation ratio for LBFL is 448 [18]. We substantially advance the state-of-the-art
for LBFL by devising an approximation algorithm for LBFL that achieves a significantly improved
approximation guarantee of 82.6.

Our improvement comes from a variety of ideas in algorithm design and analysis, which also yield
new insights into LBFL. Our chief algorithmic novelty is to present an improved method for solving
a more-structured LBFL instance obtained from $\mathcal{I}$ via a bicriteria approximation algorithm for LBFL,
wherein all clients are aggregated at a subset $\mathcal{F}'$ of facilities, each having at least $\alpha M$ co-located clients
(for some $\alpha \in [0,1]$). One of our key insights is that one can reduce the resulting LBFL instance, denoted
$\mathcal{I}_2(\alpha)$, to a problem we introduce, called capacity-discounted UFL (CDUFL). CDUFL is a special case
of capacitated facility location (CFL) where facilities are either uncapacitated, or have finite capacity
and zero opening costs. Circumventing the difficulty that CDUFL inherits the intractability of CFL with
respect to LP-based approximation guarantees, we give a simple local-search algorithm for CDUFL
based on add, delete, and swap moves that achieves the same approximation ratio (of $1 + \sqrt{2}$) as the
corresponding local-search algorithm for UFL. In contrast, the algorithm in [18] proceeds by reducing
$\mathcal{I}_2(\alpha)$ to CFL, whose current-best approximation ratio is worse than that of our local-search algorithm
for CDUFL, and this is one of the reasons behind our algorithm's improved approximation ratio.

Another new ingredient of our LBFL-algorithm and analysis is a subtly different method for constructing
a bicriteria solution for $\mathcal{I}$ (and hence, $\mathcal{I}_2(\alpha)$), combined with the more significant change that
we now choose a random $\alpha$ from a suitable distribution. This leads to a surprising degree of improvement
in the approximation factor, which is reminiscent of the mileage provided by random $\alpha$-points in
scheduling problems.
1 Introduction



Facility location problems have been widely studied in the Operations Research community (see, e.g., [13]).
In its simplest version, uncapacitated facility location (UFL), we are given a set of facilities with opening 
costs, and a set of clients, and we want to open some facilities and assign each client to an open facility so 
as to minimize the sum of the facility-opening and client-assignment costs. This problem has a wide range 
of applications. For example, a company might want to open its warehouses at some locations so that its 
total cost of opening warehouses and servicing customers is minimized. 

We consider the lower-bounded facility location (LBFL) problem, which is a generalization of UFL
where each open facility is required to serve a certain minimum amount of demand. More formally, an LBFL
instance $\mathcal{I}$ is specified by a set $\mathcal{F}$ of facilities, and a set $\mathcal{D}$ of clients. Opening facility $i$ incurs a
facility-opening cost $f_i$, and assigning a client $j$ to a facility $i$ incurs a connection cost $c_{ij}$. A feasible solution
specifies a subset $F \subseteq \mathcal{F}$ of facilities, and assigns each client $j$ to an open facility $i(j) \in F$ so that each
open facility serves at least $M$ clients, where $M$ is an input parameter. The cost of such a solution is the sum
of the facility-opening and connection costs, that is, $\sum_{i \in F} f_i + \sum_j c_{i(j)j}$, and the goal is to find a feasible
solution of minimum cost. As is standard in the study of facility location problems, we assume throughout
that the $c_{ij}$s form a metric. We use the terms connection cost and assignment cost interchangeably in the sequel.

LBFL can be motivated from various perspectives. This problem was introduced independently by
Karger and Minkoff [8], and Guha, Meyerson, and Munagala (who called the problem load-balanced facility
location) [5] (see also [3]), both of whom arrived at LBFL as a means of solving their respective buy-at-bulk
style network design problems. LBFL arises as a natural subroutine in such settings because obtaining a
near-optimal solution to the buy-at-bulk problem often entails aggregating a certain minimum demand at
certain hub locations, and then connecting the hubs via links of lower per-unit-demand cost (and higher
fixed cost). LBFL also finds direct applications in supply-chain logistics problems, where the lower-bound
constraint can be used to model the fact that it is not profitable or feasible to use services unless they satisfy
a certain minimum demand. For example (as noted in [18]), Lim, Wang, and Xu [11] use LBFL to abstract
a transportation problem faced by a company that has to determine the allocation of cargo from customers
to carriers, who then ship their cargo overseas. Here the lower bound arises because each carrier, if used, is
required (by regulation) to deliver a minimum amount of cargo. Also, LBFL is an interesting special case of
universal facility location (UniFL) [12], a generalization of UFL where the facility cost depends on the number
of clients served by it, with non-increasing facility-cost functions. UniFL with arbitrary non-increasing
functions is not a well-understood problem, and the study of LBFL may provide useful insights here.

Clearly, LBFL with $M = 1$ is simply UFL, and hence, is NP-hard; consequently, we are interested in
designing approximation algorithms for LBFL. The first constant-factor approximation algorithm for LBFL
was devised by Svitkina [18], whose approximation ratio is 448. Prior to this, the only known approximation
guarantees were bicriteria guarantees. [8] and [5] independently devised $(\rho, \alpha)$-approximation algorithms
via a reduction to UFL: these algorithms return a solution of cost at most $\rho$ times the optimum where each
open facility serves at least $\alpha M$ clients ($\alpha < 1$, $\rho$ is a function of $\alpha$).

Our results and techniques. We devise an approximation algorithm for LBFL that achieves a substantially
improved approximation guarantee of 82.6 (Theorem 3.1), thus significantly advancing the state-of-the-art
for LBFL. Our improvement comes from a combination of ideas in algorithm design and analysis, and yields
new insights about the approximability of LBFL. In order to describe the ideas underlying our improvement,
we first briefly sketch Svitkina's algorithm.

Svitkina's algorithm begins by using the reduction in [8, 5] to obtain a bicriteria solution for $\mathcal{I}$, which is
then used to convert $\mathcal{I}$ into an LBFL instance $\mathcal{I}_2$ with facility-set $\mathcal{F}' \subseteq \mathcal{F}$ having the following structure: (i)
all clients are aggregated at $\mathcal{F}'$ with each facility $i \in \mathcal{F}'$ having $n_i \ge \alpha M$ co-located clients; (ii) all facilities
in $\mathcal{F}'$ have zero opening costs; and (iii) near-optimal solutions to $\mathcal{I}_2$ translate to near-optimal solutions to
$\mathcal{I}$ (and vice versa). The goal now is to identify a subset of $\mathcal{F}'$ to close, such that transferring the clients
aggregated at these closed facilities to the remaining (open) facilities in $\mathcal{F}'$ ensures that each remaining
facility serves at least $M$ demand (and the cost incurred is "small"). [18] shows that one can achieve this
by solving a suitable CFL instance. Essentially the idea is that a facility $i$ that remains open corresponds
to a demand point in the CFL instance that requires $M - n_i$ units of demand, and a facility $i$ that is closed
maps to a supply point in the CFL instance having $n_i$ units that can be supplied to demand points (i.e., open
facilities). Of course, one does not know beforehand which facilities will be closed and which will remain
open; so to encode this correspondence in the CFL instance, we create at every location $i \in \mathcal{F}'$, a supply
point with (suitable opening cost and) capacity $M$, and a demand point with demand $M - n_i$ if $n_i < M$ (so
the supply point at $i$ has $n_i$ residual capacity after satisfying this demand). (Assume $n_i < M$ for simplicity;
facilities with $n_i \ge M$ are treated differently.) Finally, [18] argues that a CFL-solution (where a supply point
may end up sending less than $n_i$ supply to other demand points) can be mapped to a solution to $\mathcal{I}_2$ without
increasing the cost incurred by much; since CFL admits an $O(1)$-approximation algorithm, one obtains an
$O(1)$-approximate solution to $\mathcal{I}_2$, and hence to the original LBFL instance $\mathcal{I}$.

Our algorithm also proceeds by (a) obtaining an LBFL instance $\mathcal{I}_2$ satisfying properties (i)-(iii) mentioned
above, (b) solving $\mathcal{I}_2$, and (c) mapping the $\mathcal{I}_2$-solution to a solution to $\mathcal{I}$, but our implementation
of steps (a) and (b) differs from that in Svitkina's algorithm. These modified implementations, which are
independent of each other and yield significant improvements in the overall approximation ratio even when
considered in isolation, result in our much-improved approximation ratio. We detail how we perform step
(a) later, and focus first on describing how we solve $\mathcal{I}_2$, which is our chief algorithmic contribution.

Our key insight is that one can solve the LBFL instance $\mathcal{I}_2$ by reducing it to a new problem we introduce
that we call capacity-discounted UFL (CDUFL), which closely resembles UFL and admits an algorithm (that
we devise) with a much better approximation ratio than CFL. A CDUFL-instance has the property that every
facility is either uncapacitated (i.e., has infinite capacity), or has finite capacity and zero facility cost. The
CDUFL instance we construct consists of the same supply and demand points as in the reduction of $\mathcal{I}_2$ to
CFL in [18], except that all supply points with non-zero opening cost are now uncapacitated. (An interesting
consequence is that if all facilities in $\mathcal{I}_2$ have $n_i \le M$, the CDUFL instance is in fact a UFL-instance!)

We prove two crucial algorithmic results. It is not hard to see that the "standard" integrality-gap example
for the natural LP-relaxation of CFL can be cast as a CDUFL instance, thus showing that the natural
LP-relaxation for CDUFL has a large integrality gap (see Appendix A); in fact, we are not aware of any
LP-relaxation for CDUFL with constant integrality gap. Circumventing this difficulty, we devise a local-search
algorithm for CDUFL based on add, swap, and delete moves that achieves the same performance
guarantees as the corresponding local-search algorithm for UFL [7] (see Section 4.2). The local-search
algorithm yields significant dividends in the overall approximation ratio, not only because its approximation
ratio for CDUFL is better than the state-of-the-art for CFL, but also because it yields separate (asymmetric)
guarantees on the facility-opening and assignment costs, which allows one to perform a tighter analysis.
Second, we show that any near-optimal CDUFL-solution can be mapped to a near-optimal solution to $\mathcal{I}_2$
(see Section 4.1). As before, it could be that in the CDUFL-solution, a supply point $i$ (which corresponds to
facility $i$ being closed down) sends less than $n_i$ supply to other demand points, so that closing down $i$ entails
transferring its residual clients to open facilities. But since some supply points are now uncapacitated, it
could also be that $i$ sends more than $n_i$ supply to other demand points. We argue that this artifact can also
be handled without increasing the solution cost by much, by opening the facilities in a carefully-chosen
subset of $\{i\} \cup \{\text{demand points satisfied by } i\}$ and closing down the remaining facilities. For every value
of $\alpha$ (recall that the LBFL instance $\mathcal{I}_2$ is specified in terms of a parameter $\alpha$), the resulting approximation
factor for $\mathcal{I}_2$ (Theorem 3.5) is better than the guarantee obtained for $\mathcal{I}_2$ in Svitkina's algorithm; this in turn
translates (by choosing $\alpha$ suitably) to an improved solution to the original instance.

We now discuss how we implement step (a), that is, how we obtain instance $\mathcal{I}_2$. As in [18], we arrive
at $\mathcal{I}_2$ by computing a bicriteria solution to LBFL, but we obtain this bicriteria solution in a different fashion
(see Section 3). The reduction from LBFL to UFL in [8, 5] proceeds by setting the opening cost of facility
$i$ to $f_i + \frac{2\alpha}{1-\alpha} \sum_{j \in \mathcal{D}(i)} c_{ij}$, where $\mathcal{D}(i)$ is the set of $M$ clients closest to $i$, solving the resulting UFL
instance, and postprocessing using (single-facility) delete moves if such a move improves the solution cost.
We modify this reduction subtly by creating a UFL instance, where facility $i$'s opening cost is instead set to
$f_i + 2\alpha M R_i(\alpha)$, where $R_i(\alpha)$ is the distance between $i$ and the $\alpha M$-closest client to it. As in the case of
the earlier reduction, we argue that each open facility $i$ in the resulting solution (obtained by solving UFL
and postprocessing) serves at least $\alpha M$ clients. The overall bound we obtain on the total cost now includes
various $R_i(\alpha)$ terms. Instead of plugging in the (weak) bound $M R_i(\alpha) \le \frac{\sum_{j \in \mathcal{D}(i)} c_{ij}}{1-\alpha}$ (which would yield
the same guarantee as that obtained via the earlier reduction), we are able to perform a tighter analysis by
choosing $\alpha$ from a suitable distribution and leveraging the fact that $M \int_0^1 R_i(\alpha)\,d\alpha = \sum_{j \in \mathcal{D}(i)} c_{ij}$. (This can
easily be derandomized, since there are only $M$ combinatorially distinct choices for $\alpha$.) These simple
modifications (in algorithm-design and analysis) yield a surprising amount of improvement in the approximation
factor, which is reminiscent of the mileage provided by (random) $\alpha$-points for various scheduling problems
(see, e.g., [16]) and UFL [15, 17]. Also, we observe that one can obtain further improvements by using the
local-search algorithm of [2, 7] to solve the above UFL instance: this is because the resulting solution is
then already postprocessed, which allows us to exploit the asymmetric bounds on the facility-opening and
assignment costs provided by the local-search algorithm via scaling, and improve the approximation ratio.

Finally, we remark that the study of CDUFL may provide useful and interesting insights about CFL. 
CDUFL is a special case of CFL that despite its special structure inherits the intractability of CFL with 
respect to LP-based approximation guarantees. If one seeks to develop LP-based techniques and algorithms 
for CFL (which has been a long-standing and intriguing open question), then one needs to understand how 
one can leverage LP-based techniques for CDUFL, and it is plausible that LP-based insights developed for 
CDUFL may yield similar insights for CFL (and potentially LP-based approximation guarantees for CFL). 

Related work. As mentioned earlier, LBFL was independently introduced by [8] and [5], who used it as
a subroutine to solve the (rent-or-buy and hence, the) maybecast problem, and the access network design
problem respectively. Their ideas, which lead to bicriteria guarantees for LBFL, play a preprocessing role
both in Svitkina's algorithm for LBFL [18] and (slightly indirectly) in our algorithm.

There is a large body of literature that deals with approximation algorithms for (metric) UFL, CFL
and its variants; see [14] for a survey on UFL. The first constant approximation guarantee for UFL was
obtained by Shmoys, Tardos, and Aardal [15] via an LP-rounding algorithm, and the current state-of-the-art
is a 1.488-approximation algorithm due to Li [10]. Local-search techniques have also been utilized
to obtain $O(1)$-approximation guarantees for UFL [9, 2, 7]. We apply some of the ideas of [2, 7] in our
algorithm. Starting with the work of Korupolu, Plaxton, and Rajaraman [9], various local-search algorithms
with constant approximation ratios have been devised for CFL, with the current-best approximation ratio
being $5.83 + \epsilon$ [19]. Local-search approaches are however not known to work for LBFL; in Appendix B
we show that local search based on add, delete, and swap moves yields poor approximation guarantees.
Universal facility location (UniFL), where the facility cost is a non-decreasing function of the number of 
clients served by it, was introduced by [HE], and [12] gave a constant approximation algorithm for this. We
are not aware of any work on UniFL with arbitrary non-increasing functions (which generalizes LBFL). AH 
give a constant approximation for the case where the cost-functions do not decrease too steeply (the constant 
depends on the steepness); notice that LBFL does not fall into this class so their results do not apply here. 

2 Problem definition and notation 

Recall that we have a set $\mathcal{F}$ of facilities with facility-opening costs $\{f_i\}$, a set $\mathcal{D}$ of clients, metric connection
(or assignment) costs $\{c_{ij}\}$ specifying the cost of assigning client $j$ to facility $i$, and an (integer) parameter
$M$. Our objective is to open a subset $F \subseteq \mathcal{F}$ of facilities and assign each client $j$ to an open facility $i(j) \in F$, so
that at least $M$ clients are assigned to each open facility, and the total cost incurred, $\sum_{i \in F} f_i + \sum_j c_{i(j)j}$, is
minimized. We use $\mathcal{I}$ to denote this LBFL instance.
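To make the objective concrete, the following is a minimal sketch of how one might evaluate a candidate LBFL solution. The function name `lbfl_cost` and the dictionary-based encoding of $f$, $c$, and the assignment are illustrative assumptions, not notation from the paper.

```python
from collections import Counter

def lbfl_cost(open_facilities, assignment, f, c, M):
    """Total cost of an LBFL solution, or None if it is infeasible.

    open_facilities: set of facility ids (the set F)
    assignment:      dict client j -> facility i(j)
    f:               dict facility -> opening cost f_i
    c:               dict (facility, client) -> connection cost c_ij
    M:               lower bound on the number of clients per open facility
    (All names here are hypothetical, chosen for this sketch.)
    """
    load = Counter(assignment.values())
    # Every client must be assigned to an open facility.
    if any(i not in open_facilities for i in load):
        return None
    # Every open facility must serve at least M clients.
    if any(load[i] < M for i in open_facilities):
        return None
    opening = sum(f[i] for i in open_facilities)
    connection = sum(c[i, j] for j, i in assignment.items())
    return opening + connection
```

With $M = 1$ the lower-bound check is vacuous for any facility that serves a client, which matches the observation that LBFL with $M = 1$ is simply UFL.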

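Let $F^*$ and $C^*$ denote respectively the facility-opening and assignment cost of an optimal solution to $\mathcal{I}$;
we will often refer to this solution as "the optimal solution" in the sequel. We sometimes abuse notation and
also use $F^*$ to denote the set of open facilities in this optimal solution. Let $OPT = F^* + C^*$ denote the total
optimal cost. For a facility $i \in \mathcal{F}$, let $\mathcal{D}(i)$ be the set of $M$ clients closest to $i$, and $R_i(\alpha)$ denote the distance
between $i$ and the $\lceil \alpha M \rceil$-closest client to $i$; that is, if $\mathcal{D}(i) = \{j_1, \ldots, j_M\}$, where $c_{ij_1} \le \ldots \le c_{ij_M}$, then
$R_i(\alpha) = c_{ij_{\lceil \alpha M \rceil}}$ (for $0 < \alpha \le 1$). Let $R^*(\alpha) = \sum_{i \in F^*} R_i(\alpha)$. Observe that each $R_i(\alpha)$ is an increasing
function of $\alpha$, $M \int_0^1 R_i(\alpha)\,d\alpha = \sum_{j \in \mathcal{D}(i)} c_{ij}$, and $R_i(\alpha) \le \frac{\sum_{j \in \mathcal{D}(i)} c_{ij}}{M - \lceil \alpha M \rceil + 1} \le \frac{\sum_{j \in \mathcal{D}(i)} c_{ij}}{M(1-\alpha)}$.
Hence, $R^*(\alpha)$ is an increasing function of $\alpha$, $M \int_0^1 R^*(\alpha)\,d\alpha \le C^*$, and $R^*(\alpha) \le \frac{C^*}{M(1-\alpha)}$.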
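The two facts about $R_i(\alpha)$ used above can be checked on small examples. The sketch below assumes the connection costs from a single facility $i$ are given as a plain list; the helper names `R` and `integral_identity` are hypothetical. Exact fractions are used for $\alpha$ so that $\lceil \alpha M \rceil$ is not affected by floating-point rounding.

```python
import math
from fractions import Fraction

def R(costs_from_i, M, alpha):
    """R_i(alpha): distance from i to its ceil(alpha*M)-th closest client (0 < alpha <= 1)."""
    closest = sorted(costs_from_i)[:M]          # D(i): the M clients closest to i
    return closest[math.ceil(alpha * M) - 1]

def integral_identity(costs_from_i, M):
    """Return (M * integral of R_i over (0, 1], sum of c_ij over D(i)).

    R_i is a step function that is constant on ((k-1)/M, k/M], so M times the
    integral is exactly the sum of the M step heights.
    """
    lhs = sum(R(costs_from_i, M, Fraction(k, M)) for k in range(1, M + 1))
    rhs = sum(sorted(costs_from_i)[:M])
    return lhs, rhs
```

For instance, with costs `[5, 1, 4, 2, 8, 3]` and `M = 4`, both sides of the identity equal 10, and `R(costs, 4, Fraction(1, 2))` is 2, which respects the weak bound $\sum_{j \in \mathcal{D}(i)} c_{ij} / (M(1-\alpha)) = 5$.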

3 Our algorithm and the main theorem 

We now give a high-level description of our algorithm using certain building blocks that are supplied in the 
subsequent sections. Let $\mathcal{I}$ denote the LBFL instance.

(1) Obtaining a bicriteria solution. Construct a UFL instance with the same set of facilities and clients,
and the same assignment costs as $\mathcal{I}$, where the opening cost of facility $i$ is set to $f_i + 2\alpha M R_i(\alpha)$. Use the
local-search algorithm for UFL in [2] or [7] with scaling parameter $\gamma > 0$ to solve this UFL instance. (We
set $\alpha, \gamma$ suitably to get the desired approximation; see Theorem 3.1.) Let $\mathcal{F}' \subseteq \mathcal{F}$ be the set of facilities
opened in the UFL-solution. Claim 3.2 and Lemma 3.3 show that each $i \in \mathcal{F}'$ serves at least $\alpha M$ clients.

(2) Transforming to a structured LBFL instance. We use the bicriteria solution obtained above to transform
$\mathcal{I}$ into another structured LBFL instance $\mathcal{I}_2$ as in [18]. In the instance $\mathcal{I}_2$, we set the opening cost
of each $i \in \mathcal{F}'$ to zero, and we "move" to $i$ all the $n_i \ge \alpha M$ clients assigned to it, that is, all these
clients are now co-located at $i$. So $\mathcal{I}_2$ consists of only the points in $\mathcal{F}'$ (which forms both the facility-set
and client-set). We will sometimes use the notation $\mathcal{I}_2(\alpha)$ to indicate explicitly that $\mathcal{I}_2$'s specification
depends on the parameter $\alpha$.

(3) Solve $\mathcal{I}_2$ using the method described in Section 4. Obtain a solution to $\mathcal{I}$ by opening the same facilities
and making the same client assignments as in the solution to $\mathcal{I}_2$.
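The instance transformations in steps (1) and (2) can be sketched as follows. This is only an illustration of the bookkeeping, under the assumption that costs are stored in dictionaries; the UFL local-search solver itself is not shown, and the names `modified_opening_costs` and `aggregate` are hypothetical.

```python
import math
from collections import Counter
from fractions import Fraction

def modified_opening_costs(f, c, clients, M, alpha):
    """Step (1): opening cost f_i + 2*alpha*M*R_i(alpha) for the UFL instance.

    f: dict facility -> f_i;  c: dict (facility, client) -> c_ij.
    alpha should be an exact Fraction so that ceil(alpha*M) is computed exactly.
    """
    k = math.ceil(alpha * M)                       # rank of the client defining R_i(alpha)
    new_f = {}
    for i in f:
        dists = sorted(c[i, j] for j in clients)
        new_f[i] = f[i] + 2 * alpha * M * dists[k - 1]
    return new_f

def aggregate(assignment):
    """Step (2): co-locate every client at the facility it is assigned to.

    assignment: dict client -> facility in F' (the bicriteria solution).
    Returns n_i, the number of clients co-located at each i in F'.
    """
    return dict(Counter(assignment.values()))
```

In the resulting instance every facility of $\mathcal{F}'$ has opening cost zero, so only the counts $n_i$ and the distances between points of $\mathcal{F}'$ are needed downstream.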

Analysis. Our main theorem is as follows. 

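Theorem 3.1 For any $\alpha \in (0.5, 1]$ and $\gamma > 0$, the above algorithm returns a solution to $\mathcal{I}$ of cost at most

$$F^*\bigl(1 + \gamma h(\alpha)\bigr) + C^*\bigl(2h(\alpha) - 1 + \tfrac{2}{\gamma}\bigr) + 2\gamma\alpha M R^*(\alpha) h(\alpha) + 2\alpha M R^*(\alpha),$$

where $h(\alpha) = 1 + \frac{4}{\alpha} + \frac{4\alpha}{2\alpha-1} + 4\sqrt{\frac{6}{2\alpha-1}}$. Thus, we can compute efficiently a solution to $\mathcal{I}$ of cost at most:

(i) $92.84 \cdot OPT$, by setting $\alpha = 0.75$, $\gamma = 3/h(\alpha)$;

(ii) $82.6 \cdot OPT$, by letting $\gamma$ be a suitable (efficiently-computable) function of $\alpha$, and choosing $\alpha$ randomly
from the interval $[0.67, 1]$ according to the density function $p(x) = \frac{1}{\ln(1/0.67)\,x}$.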
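As a sanity check on part (i), one can evaluate the bound numerically. The sketch below takes $h(\alpha) = 1 + 4/\alpha + 4\alpha/(2\alpha-1) + 4\sqrt{6/(2\alpha-1)}$ and substitutes the bound $R^*(\alpha) \le C^*/(M(1-\alpha))$ from Section 2, so that $\alpha M R^*(\alpha) \le \frac{\alpha}{1-\alpha} C^*$. The function names are illustrative only.

```python
import math

def h(a):
    """h(alpha) as it appears in Theorem 3.1, for 0.5 < alpha <= 1."""
    return 1 + 4 / a + 4 * a / (2 * a - 1) + 4 * math.sqrt(6 / (2 * a - 1))

def bound_coefficients(a, gamma):
    """Coefficients of F* and C* in the Theorem 3.1 bound for a fixed alpha < 1,
    after replacing alpha*M*R*(alpha) by (alpha / (1 - alpha)) * C*."""
    r = a / (1 - a)
    coef_F = 1 + gamma * h(a)
    coef_C = 2 * h(a) - 1 + 2 / gamma + (2 * gamma * h(a) + 2) * r
    return coef_F, coef_C
```

With `a = 0.75` and `gamma = 3 / h(0.75)`, the coefficient of $F^*$ is 4 and the coefficient of $C^*$ evaluates to roughly 92.84, consistent with part (i).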

The roadmap for proving Theorem 3.1 is as follows. We first bound the cost of the bicriteria solution
obtained in step (1) in terms of $OPT$ (Lemma 3.3). This will allow us to bound the cost of an optimal
solution to $\mathcal{I}_2$, and argue that mapping an $\mathcal{I}_2$-solution to a solution to $\mathcal{I}$ does not increase the cost by much
(Lemma 3.4). The only missing ingredient is a guarantee on the cost of the solution to $\mathcal{I}_2$ found in step (3),
which we supply in Theorem 3.5, whose proof appears in Section 4.

The following claim follows from essentially the same arguments as in [8, 5].
Claim 3.2 Let $S'$ be a delete-optimal solution to the above UFL instance; that is, the total UFL-cost does
not decrease by deleting any open facility of $S'$. Then, each facility of $S'$ serves at least $\alpha M$ clients.

The local-search algorithms for UFL in [2, 7] have the same performance guarantees and both include a
delete-move as a local-search operation, so upon termination, we obtain a delete-optimal solution.¹ Observe
that opening the same facilities and making the same client assignments as in the optimal solution to $\mathcal{I}$
yields a solution $S$ to the UFL instance constructed in step (1) of the algorithm with facility cost $F^S \le
F^* + 2\alpha M R^*(\alpha)$ and assignment cost $C^S \le C^*$. Combined with the analysis in [2, 7], this yields the
following. (For simplicity, we assume that all local-search algorithms return a local optimum; standard
arguments show that dropping this assumption increases the approximation by at most a $(1 + \epsilon)$ factor.)
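Delete-optimality, as used in Claim 3.2, can be enforced by a simple postprocessing loop. The following is a minimal sketch, assuming each client is served by its nearest open facility; it is not the local-search algorithm of the cited works, only the delete-move cleanup, and the function names are hypothetical.

```python
def ufl_cost(open_set, f, c, clients):
    """UFL cost when every client connects to its cheapest open facility."""
    opening = sum(f[i] for i in open_set)
    connection = sum(min(c[i, j] for i in open_set) for j in clients)
    return opening + connection

def make_delete_optimal(open_set, f, c, clients):
    """Repeatedly close a facility whenever doing so strictly decreases the UFL cost."""
    open_set = set(open_set)
    improved = True
    while improved and len(open_set) > 1:
        improved = False
        current = ufl_cost(open_set, f, c, clients)
        for i in sorted(open_set):
            if ufl_cost(open_set - {i}, f, c, clients) < current:
                open_set.remove(i)
                improved = True
                break
    return open_set
```

Each successful delete strictly lowers the cost and removes a facility, so the loop performs at most one pass per initially open facility.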

Lemma 3.3 For a given parameter $\gamma > 0$, executing the local-search algorithm in [2, 7] on the above UFL
instance returns a solution with facility cost $F_b$ and assignment cost $C_b$ satisfying $F_b \le F^* + 2\alpha M R^*(\alpha) +
2C^*/\gamma$, $C_b \le \gamma\bigl(F^* + 2\alpha M R^*(\alpha)\bigr) + C^*$, where each open facility serves at least $\alpha M$ clients.

Lemma 3.4 ([18]) (i) The (assignment) cost $C^*_{\mathcal{I}_2}$ of an optimal solution to $\mathcal{I}_2$ is at most $2(C_b + C^*)$.
(ii) Any solution to $\mathcal{I}_2$ of cost $C$ yields a solution to $\mathcal{I}$ of cost at most $F_b + C_b + C$.

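Theorem 3.5 For any $\alpha > 0.5$, there is a $g(\alpha)$-approximation algorithm for $\mathcal{I}_2(\alpha)$, where
$g(\alpha) = \frac{2}{\alpha} + \frac{2\alpha}{2\alpha-1} + 2\sqrt{\frac{2}{\alpha^2} + \frac{4}{2\alpha-1}}$.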



Remark 3.6 Our $g(\alpha)$-approximation ratio for $\mathcal{I}_2(\alpha)$ improves upon the approximation obtained in [18] by
a factor of roughly 2 for all $\alpha$. Thus, plugging in our algorithm for solving $\mathcal{I}_2$ in the LBFL-algorithm in [18]
(and choosing a suitable $\alpha$), already yields an improved approximation factor of 218 for LBFL.



Proof of Theorem 3.1: Recall that $h(\alpha) = 1 + \frac{4}{\alpha} + \frac{4\alpha}{2\alpha-1} + 4\sqrt{\frac{6}{2\alpha-1}}$. Note that $2g(\alpha) + 1 \le h(\alpha)$
for all $\alpha \in [0, 1]$; we use this upper bound throughout below. Combining Theorem 3.5 and the bounds in
Lemmas 3.3 and 3.4, we obtain a solution to $\mathcal{I}$ of cost at most

$$F_b + \bigl(2g(\alpha) + 1\bigr) C_b + 2g(\alpha) C^*
\le F^* + 2\alpha M R^*(\alpha) + \frac{2C^*}{\gamma} + h(\alpha)\gamma\bigl(F^* + 2\alpha M R^*(\alpha)\bigr) + \bigl(2h(\alpha) - 1\bigr) C^*$$
$$= F^*\bigl(1 + \gamma h(\alpha)\bigr) + C^*\bigl(2h(\alpha) - 1 + \tfrac{2}{\gamma}\bigr) + 2\gamma\alpha M R^*(\alpha) h(\alpha) + 2\alpha M R^*(\alpha).$$

Part (i) follows by plugging in the values of $\alpha$ and $\gamma$, and using the bound $R^*(\alpha) \le \frac{C^*}{M(1-\alpha)}$.



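Let $\beta = 0.67$. For part (ii), we set $\gamma = K/\sqrt{h(\alpha)}$, where $K = \Bigl(\ln^2(1/\beta) \cdot E_\alpha[h(\alpha)] \Big/ \bigl(\tfrac{\int_\beta^1 h(x)\,dx}{1-\beta}\bigr)\Bigr)^{.25}$.
Plugging in this $\gamma$, we see that the cost incurred is at most

$$F^*\bigl(1 + K\sqrt{h(\alpha)}\bigr) + C^*\bigl(2h(\alpha) - 1 + \tfrac{2}{K}\sqrt{h(\alpha)}\bigr) + 2K\alpha M R^*(\alpha)\sqrt{h(\alpha)} + 2\alpha M R^*(\alpha).$$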



We now bound the expected cost incurred when one chooses $\alpha$ randomly according to the stated density
function. This will also yield an explicit expression for $K$ (as a function of $\beta$), thus showing that $K$ (and
hence, $\gamma$) can be computed efficiently. We note that $E[\sqrt{X}] \le \sqrt{E[X]}$ and utilize Chebyshev's integral
inequality (see Q): if $f$ and $g$ are non-increasing and non-decreasing functions respectively from $[a, b]$ to
$\mathbb{R}_+$, then $\int_a^b f(x)g(x)\,dx \le \frac{(\int_a^b f(x)\,dx)(\int_a^b g(x)\,dx)}{b-a}$. Observe that $h(\alpha)$ decreases with $\alpha$. Recall that $\beta = 0.67$.

¹ A subtle point is that typically local-search algorithms terminate only with an "approximate" local optimum. However, one
can then execute all delete moves that improve the solution cost, and thereby obtain a delete-optimal solution.



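We have the following.

$$E_\alpha[h(\alpha)] = c_2(\beta) := 1 + \frac{\frac{4}{\beta} - 4 + 8\sqrt{6}\,\bigl(\pi/4 - \tan^{-1}\sqrt{2\beta-1}\bigr) + 2\ln\frac{1}{2\beta-1}}{\ln(1/\beta)},$$

$$E_\alpha[\alpha M R^*(\alpha)] = M\Bigl(\int_\beta^1 R^*(x)\,dx\Bigr) \Big/ \ln(1/\beta) \le C^*/\ln(1/\beta).$$

Finally, using Chebyshev's inequality, we obtain that

$$E_\alpha\bigl[\alpha M R^*(\alpha)\sqrt{h(\alpha)}\bigr]
\le \frac{M\bigl(\int_\beta^1 R^*(x)\,dx\bigr)\bigl(\int_\beta^1 \sqrt{h(x)}\,dx\bigr)}{(1-\beta)\ln(1/\beta)}
\le \frac{C^*\sqrt{c_3(\beta)}}{\ln(1/\beta)},$$

where

$$c_3(\beta) := \Bigl(\int_\beta^1 h(x)\,dx\Bigr) \Big/ (1-\beta) = \frac{4\ln(1/\beta) + 4\sqrt{6}\,\bigl(1 - \sqrt{2\beta-1}\bigr) + 3(1-\beta) + \ln\frac{1}{2\beta-1}}{1-\beta}.$$

The second inequality follows since $\bigl(\int_\beta^1 \sqrt{h(x)}\,dx\bigr)/(1-\beta) = E_{\alpha \text{ uniform in } [\beta,1]}\bigl[\sqrt{h(\alpha)}\bigr] \le \sqrt{c_3(\beta)}$. Plugging in
these bounds, we get that $K = \bigl(\ln^2(1/\beta)\,c_2(\beta)/c_3(\beta)\bigr)^{.25}$ and the total cost is at most

$$F^*\bigl(1 + K\sqrt{c_2(\beta)}\bigr) + C^*\Bigl(2c_2(\beta) - 1 + \tfrac{2}{K}\sqrt{c_2(\beta)} + \frac{2K\sqrt{c_3(\beta)} + 2}{\ln(1/\beta)}\Bigr) \le 82.6 \cdot OPT.$$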
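The constants in part (ii) can be evaluated numerically. The sketch below assumes the closed forms $c_2(\beta) = 1 + \bigl[4/\beta - 4 + 8\sqrt{6}(\pi/4 - \tan^{-1}\sqrt{2\beta-1}) + 2\ln\frac{1}{2\beta-1}\bigr]/\ln(1/\beta)$ and $c_3(\beta) = \bigl[4\ln(1/\beta) + 4\sqrt{6}(1-\sqrt{2\beta-1}) + 3(1-\beta) + \ln\frac{1}{2\beta-1}\bigr]/(1-\beta)$, which are what one obtains by integrating $h(x)/x$ and $h(x)$ over $[\beta, 1]$, together with $K = (\ln^2(1/\beta)\,c_2(\beta)/c_3(\beta))^{1/4}$. The function names are illustrative.

```python
import math

def c2(b):
    """E[h(alpha)] under the density 1/(ln(1/b) * x) on [b, 1]."""
    s = math.sqrt(2 * b - 1)
    num = (4 / b - 4
           + 8 * math.sqrt(6) * (math.pi / 4 - math.atan(s))
           + 2 * math.log(1 / (2 * b - 1)))
    return 1 + num / math.log(1 / b)

def c3(b):
    """Average of h over [b, 1] (uniform distribution)."""
    s = math.sqrt(2 * b - 1)
    num = (4 * math.log(1 / b) + 4 * math.sqrt(6) * (1 - s)
           + 3 * (1 - b) + math.log(1 / (2 * b - 1)))
    return num / (1 - b)

def expected_coefficients(b):
    """Return (K, coefficient of F*, coefficient of C*) in the expected-cost bound."""
    L = math.log(1 / b)
    K = (L ** 2 * c2(b) / c3(b)) ** 0.25
    coef_F = 1 + K * math.sqrt(c2(b))
    coef_C = (2 * c2(b) - 1 + (2 / K) * math.sqrt(c2(b))
              + (2 * K * math.sqrt(c3(b)) + 2) / L)
    return K, coef_F, coef_C
```

For `b = 0.67` the coefficient of $C^*$ comes out just under 82.6 and dominates the coefficient of $F^*$, in line with the 82.6 guarantee. The chosen $K$ balances the two $K$-dependent terms in the $C^*$ coefficient.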



4 Solving instance $\mathcal{I}_2(\alpha)$

We now describe our algorithm for solving instance $\mathcal{I}_2(\alpha)$ and analyze its performance guarantee, thereby
proving Theorem 3.5. As mentioned earlier, one of the key differences between our algorithm and the one
in [18] is that instead of reducing $\mathcal{I}_2$ to capacitated facility location (CFL), we solve $\mathcal{I}_2$ by reducing it to a new
problem that we call capacity-discounted UFL (CDUFL). CDUFL is a special case of CFL where all facilities
with non-zero opening cost are uncapacitated (i.e., have infinite capacity). Perhaps surprisingly, despite
this special structure, CDUFL inherits the intractability of CFL with respect to LP-based approximation
guarantees: there is no known LP-relaxation for CDUFL that has constant integrality gap; Appendix A shows
that the natural LP-relaxation for CDUFL has bad integrality gap. However, as we show in Section 4.2, we
can obtain a simple local-search algorithm for CDUFL whose approximation ratio is better than the
current-best approximation for CFL.

Recall that $\mathcal{I}_2$ has only the points in $\mathcal{F}' \subseteq \mathcal{F}$, and there are $n_i \ge \alpha M$ co-located clients at each $i \in \mathcal{F}'$.
Let $l(i) = \min_{i' \in \mathcal{F}' : i' \ne i} c_{ii'}$. To avoid confusion, we refer to the facilities and clients in the CDUFL instance
as supply points and demand points respectively. The CDUFL instance created to solve $\mathcal{I}_2$ resembles the
CFL instance created in [18]; the difference is that all supply points with non-zero opening costs are now
uncapacitated. More precisely, at each $i \in \mathcal{F}'$, we create an uncapacitated supply point with opening cost
$\delta \min\{n_i, M\}\, l(i)$, where $\delta$ is a parameter we fix later. If $n_i > M$ we create a second supply point at $i$ with
capacity $n_i - M$ and zero opening cost. If $n_i < M$, we create a demand point at $i$ with demand $M - n_i$.
Let $\mathcal{I}'$ denote this CDUFL instance (see Fig. 1). Let $F^u$, $F^c$ denote respectively the set of uncapacitated and
capacitated supply points of $\mathcal{I}'$. Roughly speaking, satisfying a demand point $i$ by non-co-located supply
points translates to leaving facility $i$ open in the $\mathcal{I}_2$ solution; hence, its demand is set to $M - n_i$, which is the
number of additional clients it needs. Conversely, opening the uncapacitated supply point at $i$ and supplying
demand points from $i$ translates to closing $i$ in the $\mathcal{I}_2$ solution and transferring its co-located clients to other
open facilities.
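The construction of the CDUFL instance can be sketched directly from this description. The data layout below (a list of supply-point records and a dictionary of demands) and the name `build_cdufl` are assumptions made for illustration; `capacity=None` stands for an uncapacitated supply point.

```python
def build_cdufl(n, dist, M, delta):
    """Build the supply and demand points of the CDUFL instance from I_2.

    n:     dict location i in F' -> number n_i of co-located clients
    dist:  dict (i, i2) -> metric distance c_{i i2}, given for both orders
    M:     lower bound;  delta: the parameter fixed later in the analysis
    Assumes F' contains at least two locations, so that l(i) is defined.
    """
    supply, demand = [], {}
    for i in n:
        l_i = min(dist[i, i2] for i2 in n if i2 != i)     # l(i): nearest other location
        # Uncapacitated supply point with opening cost delta * min{n_i, M} * l(i).
        supply.append({"at": i, "capacity": None,
                       "open_cost": delta * min(n[i], M) * l_i})
        if n[i] > M:
            # Second, capacitated supply point with zero opening cost.
            supply.append({"at": i, "capacity": n[i] - M, "open_cost": 0})
        elif n[i] < M:
            # Demand point for the additional clients that i would need to stay open.
            demand[i] = M - n[i]
    return supply, demand
```

Every record with a finite capacity has opening cost zero, which is exactly the defining property of a CDUFL instance.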
Figure 1: (a) An $\mathcal{I}_2$ instance. Each box denotes a facility, and the number inside the box is the number of
co-located clients; a dashed arrow $i \to i'$ denotes that $i'$ is the closest facility to $i$.
(b) The corresponding $\mathcal{I}'$ instance, and a solution $S$ for $\mathcal{I}'$. The boxes and circles represent supply points and
demand points respectively, and points inside a dotted oval are co-located. A solid box denotes an uncapacitated
supply point, and a dashed box denotes a capacitated facility whose capacity is shown inside the box. The number
inside a circle is the demand of that demand point. The arrows indicate a solution $S$ to $\mathcal{I}'$, where $i$ and $i'$ are the
two open uncapacitated supply points.

Lemma 4.1 ([18]) There exists a solution to $\mathcal{I}'$ with facility cost $F \le \delta C^*_{\mathcal{I}_2}$ and assignment cost $C \le C^*_{\mathcal{I}_2}$.

Theorem 4.2 (i) Given any CDUFL instance, one can efficiently compute a solution with facility-opening
cost $F \le F^{sol} + 2C^{sol}$ and assignment cost $C \le F^{sol} + C^{sol}$, where $F^{sol}$ and $C^{sol}$ are the facility and
assignment costs of an arbitrary solution to the CDUFL instance.

(ii) Thus, Lemma 4.1 implies that one can compute a solution to $\mathcal{I}'$ with facility cost $F_{\mathcal{I}'}$ and assignment
cost $C_{\mathcal{I}'}$ satisfying $F_{\mathcal{I}'} \le (2 + \delta) C^*_{\mathcal{I}_2}$, $C_{\mathcal{I}'} \le (1 + \delta) C^*_{\mathcal{I}_2}$.
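For completeness, part (ii) follows from part (i) by a one-line substitution. Taking the arbitrary solution in part (i) to be the one from Lemma 4.1, so that $F^{sol} \le \delta C^*_{\mathcal{I}_2}$ and $C^{sol} \le C^*_{\mathcal{I}_2}$, one gets:

```latex
\begin{align*}
F_{\mathcal{I}'} &\le F^{sol} + 2C^{sol} \le \delta\, C^*_{\mathcal{I}_2} + 2\, C^*_{\mathcal{I}_2} = (2+\delta)\, C^*_{\mathcal{I}_2}, \\
C_{\mathcal{I}'} &\le F^{sol} + C^{sol} \le \delta\, C^*_{\mathcal{I}_2} + C^*_{\mathcal{I}_2} = (1+\delta)\, C^*_{\mathcal{I}_2}.
\end{align*}
```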

We defer the description of the local-search algorithm for CDUFL, and the proof of Theorem 4.2, to
Section 4.2. We first describe how to convert an $\mathcal{I}'$-solution to a solution to $\mathcal{I}_2$ with a small increase in cost, and
show how this combined with Theorem 4.2 leads to the approximation bound for $\mathcal{I}_2$ stated in Theorem 3.5.

4.1 Mapping an $\mathcal{I}'$-solution to an $\mathcal{I}_2$-solution

An $\mathcal{I}'$-solution need not directly translate to an $\mathcal{I}_2$ solution because an open supply point $i$ may not supply
(and hence, transfer) exactly $n_i$ units of demand (see, e.g., $i$ and $i'$ in Fig. 1(b)). Since we have uncapacitated
supply points, we have to consider both the cases where $i$ supplies more than $n_i$ demand (a situation not
encountered in [18]), and less than $n_i$ demand. Suppose that we are given a solution $S$ to $\mathcal{I}'$ with facility
cost $F^S$ and assignment cost $C^S$ (see Fig. 1(b)). Again, we abuse notation and use $F^S$ to also denote the
set of supply points that are opened in $S$. Let $N_i$, initialized to $n_i$, keep track of the number of clients at
location $i \in \mathcal{F}'$. Our goal is to reassign clients (using $S$ as a template) so that at the end we have $N_i = 0$ or
$N_i \ge M$ for each $i \in \mathcal{F}'$. Observe that once we have determined which facilities in $\mathcal{F}'$ will have $N_i \ge M$
(i.e., the facilities to open in the $\mathcal{I}_2$-solution), one can find the best way of (re)assigning clients by solving a
min-cost flow problem. However, for purposes of analysis, it will often be convenient to explicitly specify
a (possibly suboptimal) reassignment. We may assume that: (i) $F^c \subseteq F^S$; (ii) if $S$ opens an uncapacitated
supply point located at some $i \in \mathcal{F}'$ with $n_i > M$, then the demand assigned to the capacitated supply
point at $i$ equals its capacity $n_i - M$; (iii) for each $i \in \mathcal{F}'$ with $n_i < M$, if the supply point at $i$ is open
then it serves the entire demand of the co-located demand point; and (iv) at most one uncapacitated supply
point serves, maybe partially, the demand of any demand point; we say that this uncapacitated supply point
satisfies the demand point. We reassign clients in three phases.

A1. (Removing capacitated supply points) Consider any location $i \in \mathcal{F}'$ with $n_i > M$. Let $i^1$ and $i^2$
denote respectively the capacitated and uncapacitated supply points located at $i$. If $i^1$ supplies $x$ units to
the demand point at location $i'$, we transfer $x$ clients from location $i$ to $i'$. Now if $i^1$ has $y > 0$ leftover
units of capacity in $S$, then we "move" $y$ clients to $i^2$ (which is not open in $S$). We update the $N_i$s
accordingly. Note that this reassignment effectively gets rid of all capacitated supply points. Thus, there
is now exactly one uncapacitated supply point and at most one demand point at each location $i \in \mathcal{F}'$;
we refer to these simply as supply point $i$ and demand point $i$ below.

Let $X_i$ be the total demand from other locations assigned to supply point $i$. Let $\mathcal{F}^G = \{i \in \mathcal{F}' : N_i \le X_i\}$,
$\mathcal{F}^R = \{i \in \mathcal{F}' : N_i > X_i > 0\}$, and $\mathcal{F}^B = \{i \in \mathcal{F}' : X_i = 0\}$, which is the set of supply points that
are not opened in $S$. Note that $N_i \ge \min\{n_i, M\} \ge \alpha M$ for all $i \in \mathcal{F}'$, and $N_i = \min\{n_i, M\}$ for all
$i \in \mathcal{F}^R \cup \mathcal{F}^G$ (because of properties (ii) and (iii) above).
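The three-way split can be made concrete with a few lines of code. The following is only an illustrative sketch: the function name, the dictionaries `N` and `X`, and the sample numbers are our own assumptions, and the guard `X[i] > 0` merely makes explicit that a location with $X_i = 0$ belongs to $\mathcal{F}^B$.

```python
def classify_supply_points(N, X):
    """Split locations by comparing residual clients N[i] with the demand X[i]
    that supply point i sends to other locations (hypothetical helper)."""
    F_G = {i for i in N if X[i] > 0 and N[i] <= X[i]}   # N_i <= X_i
    F_R = {i for i in N if N[i] > X[i] > 0}             # N_i > X_i > 0
    F_B = {i for i in N if X[i] == 0}                   # not opened in S
    return F_G, F_R, F_B


# Made-up data: four locations with their current N_i and X_i values.
N = {"a": 5, "b": 6, "c": 8, "d": 4}
X = {"a": 7, "b": 2, "c": 0, "d": 4}
F_G, F_R, F_B = classify_supply_points(N, X)
```

Every location lands in exactly one of the three sets, which is what phases A2 and A3 rely on.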

A2. (Taking care of $\mathcal{F}^R$ and demand points satisfied by $\mathcal{F}^R$) For each $i \in \mathcal{F}^R$, if $i$ supplies $x$ units to
demand point $i'$, we move $x$ clients from $i$ to $i'$, and update $N_i$, $N_{i'}$. We now have $N_i = \min\{n_i, M\} - X_i$
residual clients at each $i \in \mathcal{F}^R$, which we must reduce to 0, or increase to at least $M$. We follow the
same procedure as in [18], which we sketch below.

For each $i \in \mathcal{F}^R$, we include an edge $(i, i')$ where $i' \in \mathcal{F}'$ is the facility nearest to $i$ (recall that
$c_{ii'} = l(i)$). We use an arbitrary but fixed tie-breaking rule here, so each component of the resulting
digraph is a directed tree rooted at either (i) a node $r \in \mathcal{F}' \setminus \mathcal{F}^R$, or (ii) a 2-cycle $(r, r'), (r', r)$,
where $r, r' \in \mathcal{F}^R$. We break up each component $T$ into a collection of smaller components as follows.
Essentially, we move the residual clients of supply points in the component bottom-up from the leaves
up to the root, cut off the component at the first node $u$ that accumulates at least $M$ clients, and recurse
on the portion of the component not containing $u$. More precisely, let $T_u$ denote the subtree of $T$ rooted
at node $u \in T$ (if $u$ belongs to a 2-cycle then we do not include the other node of this 2-cycle in $T_u$).

- If $\sum_{i \in T} N_i < M$, or if $T$ is of type (i) and all children $u$ of the root satisfy $\sum_{i \in T_u} N_i < M$, we
leave $T$ unchanged.

- Otherwise, let $u$ be a deepest (i.e., furthest from root) node in $T$ such that $\sum_{i \in T_u} N_i \ge M$. We delete
the arc leaving $u$. If this disconnects $u$ from $T \setminus T_u$, then we recurse on $T \setminus T_u$.

- Otherwise $u$ must belong to the root 2-cycle of $T$. Let $r'$ be the other node of this 2-cycle. If
$\sum_{i \in T_{r'}} N_i \ge M$, we delete $r'$'s outgoing arc (thus splitting $T$ into $T_u$ and $T_{r'}$).

After applying the above procedure (to all components), if we are left with a component of type (ii) with
$\sum_{i \in \text{component}} N_i \ge M$, we convert it to type (i) by arbitrarily deleting one of the arcs of the 2-cycle.
Thus, at the end of this process, we have two types of components.

(a) A tree $T$ rooted at a node $r$: we move the $N_i$ residual clients of each non-root node $i \in T$ to $r$.

(b) A type-(ii) tree $T$ with root $\{r, r'\}$: we must have $\sum_{i \in T} N_i < M$. Let $i' \in \mathcal{F}^B$ be the location
nearest to $\{r, r'\}$; we move the $N_i$ residual clients of each $i \in T$ to $i'$.

Update the $N_i$'s to reflect the above reassignment. Observe that we now have $N_i = 0$ or $N_i \ge M$
for each $i \in \mathcal{F}^R$, and each $i \in \mathcal{F}^B$ has $N_i \ge M$, or is a demand point satisfied by a supply point in
$\mathcal{F}^G$. Figure 2(a) shows a snapshot after steps A1 and A2 have been executed on the solution shown in
Fig. 1(b). Here $i' \in \mathcal{F}^R$ has one client left after moving clients to the bottom two facilities, which is
then transferred to i^.
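To illustrate the structure used in phase A2, the sketch below builds the nearest-facility digraph and reports, for a given node, whether its component is rooted at a node outside $\mathcal{F}^R$ (type (i)) or at a 2-cycle (type (ii)). It is only a toy: locations are placed on a line (a hypothetical metric), ties are broken by smallest identifier (one possible fixed rule), and the names `nearest_facility_digraph` and `root_of` are ours. The bottom-up splitting of components is not implemented here.

```python
def nearest_facility_digraph(points, F_R):
    """points: dict id -> coordinate on a line (hypothetical metric).
    Returns arc[i] = nearest other facility, for every i in F_R."""
    arc = {}
    for i in F_R:
        others = [k for k in points if k != i]
        # Fixed tie-breaking rule: smaller distance first, then smaller id.
        arc[i] = min(others, key=lambda k: (abs(points[i] - points[k]), k))
    return arc


def root_of(i, arc):
    """Follow arcs from i. Returns ('node', r) for a type-(i) root r outside F_R,
    or ('2-cycle', {r, r'}) for a type-(ii) root."""
    seen = []
    while i in arc and i not in seen:
        seen.append(i)
        i = arc[i]
    if i not in arc:                       # reached a node with no outgoing arc
        return ("node", i)
    return ("2-cycle", frozenset({i, arc[i]}))


# Toy data: five locations on a line; location 5 is not in F_R.
points = {1: 0.0, 2: 1.0, 3: 3.0, 4: 10.0, 5: 11.5}
arc = nearest_facility_digraph(points, F_R={1, 2, 3, 4})
```

With a consistent tie-breaking rule, following nearest-neighbor arcs can only end at a node without an outgoing arc or in a cycle of length two, which is why `root_of` needs only these two cases.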
A3. (Taking care of $\mathcal{F}^G$ and demand points satisfied by $\mathcal{F}^G$) For $i \in \mathcal{F}^G$, let $D(i)$ be the set of demand
points $j \in \mathcal{F}'$, $j \ne i$, satisfied by $i$, and let $D'(i) = \{j \in D(i) : N_j < M\}$. Note that $D(i) \subseteq \mathcal{F}^B$.
Phase A2 may only increase $N_j$ for all $j$ in $\mathcal{F}^B \cup \mathcal{F}^G$, so $N_j \ge \alpha M$ for all $j \in \mathcal{F}^G \cup \bigl(\bigcup_{i \in \mathcal{F}^G} D(i)\bigr)$.

Fix $i \in \mathcal{F}^G$. We reassign clients so that $N_j = 0$ or $N_j \ge M$ for all $j \in \{i\} \cup D'(i)$, without decreasing
$N_j$ for $j \in D(i) \setminus D'(i)$. Applying this procedure to all supply points in $\mathcal{F}^G$ will complete our task.
Define $Y_j = M - N_j$ (which is at most $M - n_j$) for $j \in D'(i)$. We consider two cases.
- $\sum_{j \in D'(i)} Y_j \le N_i$. For each $j \in D'(i)$, if $i$ supplies $x$ units to $j$, we transfer $x$ clients from $i$ to $j$. If
$i$ is now left with less than $M$ residual clients, we move these residual clients to the location in $D(i)$
nearest to $i$.

- $\sum_{j \in D'(i)} Y_j > N_i$ (see Fig. 2). Let $i_0 = i$, and $D'(i) = \{i_1, \ldots, i_t\}$, where $c_{i i_1} \le \ldots \le c_{i i_t}$. Let
$\ell = t - \bigl\lfloor \sum_{r=0}^{t} N_{i_r} / M \bigr\rfloor = \bigl\lceil \bigl(\sum_{r=1}^{t} Y_{i_r} - N_{i_0}\bigr) / M \bigr\rceil$,
so $\ell \ge 1$ (and $\ell < t$ since $N_{i_0} + N_{i_1} \ge M$). Note that $\ell$ is
the unique index such that $\sum_{r=\ell+1}^{t} Y_{i_r} \le \sum_{r=0}^{\ell} N_{i_r} < \sum_{r=\ell+1}^{t} Y_{i_r} + M$. This enables us to transfer
$Y_{i_q}$ clients to each $i_q$, $q = \ell+1, \ldots, t$, from the locations $i_\ell, \ldots, i_0$ (we do this by transferring all
clients of $i_r$, where $1 \le r \le \ell$, before considering $i_{r-1}$) and be left with at most $M$ residual clients
in $\{i_0, \ldots, i_\ell\}$. We argue that these residual clients are all concentrated at $i_0$ and $i_1$, with $i_1$ having
at most $(1 - \alpha)M$ residual clients. We transfer these residual clients to $i_{\ell+1}$.
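The choice of the split index in the second case can be checked numerically. The sketch below assumes the index is $\ell = t - \lfloor \sum_{r=0}^{t} N_{i_r} / M \rfloor$, which is what the two-sided condition $\sum_{r>\ell} Y_{i_r} \le \sum_{r \le \ell} N_{i_r} < \sum_{r>\ell} Y_{i_r} + M$ forces; the function name `split_index` and the sample values are our own.

```python
def split_index(N, M):
    """N[0] is the residual count at i_0 = i; N[1..t] are the counts at i_1..i_t,
    listed in order of increasing distance from i. Returns (ell, leftover)."""
    t = len(N) - 1
    ell = t - sum(N) // M
    Y = [M - n for n in N]                 # Y_j = M - N_j
    need = sum(Y[ell + 1:])                # clients needed to fill i_{ell+1}..i_t
    have = sum(N[:ell + 1])                # clients available at i_0..i_ell
    assert need <= have < need + M         # the defining property of ell
    return ell, have - need                # leftover = residual clients, fewer than M


# Hypothetical example with M = 8 and every N_j >= M / 2.
ell, leftover = split_index([5, 5, 6, 5, 5], 8)
```

Here the locations $i_2, i_3, i_4$ are filled up to $M$ using the ten clients at $i_0, i_1$, and two residual clients remain to be moved to $i_{\ell+1} = i_2$.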




Figure 2: The number inside a box is the current value of $N_i$; the number labeling an arrow is the demand
assignment of the $\mathcal{I}'$-solution. The circles indicate demand points $j$ with $N_j < M$. (a) The situation after
running steps A1 and A2 on the solution in Fig. 1(b). (b) The situation after running step A3. (The example uses $M = 8$.)



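Theorem 4.3 The above algorithm returns an $\mathcal{I}_2$-solution of cost at most $\frac{F^S}{\delta\alpha} + C^S\bigl(\frac{1}{\alpha} + \frac{2\alpha}{2\alpha-1}\bigr)$. Thus, taking
$S$ to be the solution mentioned in part (ii) of Theorem 4.2 and $\delta = \sqrt{\frac{2/\alpha}{1/\alpha + (2\alpha)/(2\alpha-1)}}$, we obtain a solution
to $\mathcal{I}_2(\alpha)$ satisfying the approximation bound stated in Theorem 3.5.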



Proof : Let $S_2$ denote the solution computed for $\mathcal{I}_2$. For a supply point $i$ opened in $S$, we use $C_i^S$ to denote
the cost incurred in supplying demand from $i$ to the demand points satisfied by $i$; so $C^S = \sum_{i \in F^S} C_i^S$. In
various steps, we transfer clients between locations according to the assignment in the CDUFL solution $S$,
and the cost incurred in this reassignment can be charged against the $C_i^S$'s of the appropriate supply points.
So the cost of phase A1 is $\sum_{i \in \mathcal{F}^c} C_i^S$, and the cost of the first step of phase A2 is $\sum_{i \in \mathcal{F}^R} C_i^S$.
As in [18], we can bound the remaining cost of phase A2, incurred in transferring clients according to
the tree edges, by $F^S/(\delta\alpha) + \bigl(\sum_{i \in \mathcal{F}^R} C_i^S\bigr)/(2\alpha - 1)$. When we move clients up to the root of a component, we
move strictly less than $M$ clients along any edge $(i, i')$ in that component, and since $i \in \mathcal{F}^R$, we pay at least
$\delta\alpha M l(i)$ opening cost for $i$. The only unaccounted cost now is the cost incurred in step (b) of phase A2,
where we have a tree $T$ rooted at $\{r, r'\}$. Let $i' \in \mathcal{F}^B$ be the location nearest to $\{r, r'\}$, and (say) $c_{i'r} \le c_{i'r'}$.
Note that we have already bounded the cost in transferring clients to $r$, so we only need to bound the cost
incurred in transferring at most $M$ clients from $r$ to $i'$. This is at most $M \cdot \frac{C_r^S + C_{r'}^S}{X_r + X_{r'}} \le (C_r^S + C_{r'}^S)/(2\alpha - 1)$,
because $\{r, r'\}$ send $X_r + X_{r'} = (n_r + n_{r'}) - (N_r + N_{r'}) \ge (2\alpha - 1)M$ units to demand points in $\mathcal{F}^B$, all
of which are at distance at least $c_{i'r}$ from $\{r, r'\}$.

Finally, consider phase A3 and some $i \in \mathcal{F}^G$. If $\sum_{j \in D'(i)} Y_j \le N_i$, then the cost incurred is at
most $C_i^S + M \cdot \frac{C_i^S}{X_i} \le C_i^S\bigl(1 + \frac{1}{\alpha}\bigr)$ (as $X_i \ge N_i \ge \alpha M$). Now consider the case $\sum_{j \in D'(i)} Y_j > N_i$.
For any $i_q \in \{i_{\ell+1}, \ldots, i_t\}$ and any $i_r \in \{i_0, \ldots, i_\ell\}$, we have $c_{i_r i_q} \le 2c_{i i_q}$, so the cost of transferring
$Y_{i_q} \le M - n_{i_q}$ clients to each $i_q$, $q = \ell+1, \ldots, t$, is at most $2C_i^S$. Observe that $(t - \ell + 1)M > \sum_{r=0}^{t} N_{i_r}$,
i.e., $M + \sum_{q=\ell+1}^{t} Y_{i_q} > \sum_{r=0}^{\ell} N_{i_r}$, so after this reassignment, there are less than $M$ residual clients in
$i_0, \ldots, i_\ell$. By our order of transferring clients, all these residual clients are at $i_0, i_1$ (otherwise we would have
at least $N_{i_0} + N_{i_1} \ge M$ residual clients) with at most $M - N_i \le (1 - \alpha)M$ of them located at $i_1$. The cost of
reassigning these residual clients is at most $(1 - \alpha)M c_{i i_1} + M c_{i i_{\ell+1}} \le (1 - \alpha)M \cdot \frac{C_i^S}{\sum_{r=1}^{t} Y_{i_r}} + M \cdot \frac{C_i^S}{\sum_{r=\ell+1}^{t} Y_{i_r}}$,
since $C_i^S$ is the total cost of supplying at least $Y_{i_r}$ demand to each $i_r$, $r = 1, \ldots, t$. The latter expression is
at most $C_i^S\bigl(\frac{1-\alpha}{\alpha} + \frac{1}{2\alpha-1}\bigr)$ (since $\sum_{r=1}^{t} Y_{i_r} > N_i \ge \alpha M$ and $\sum_{r=\ell+1}^{t} Y_{i_r} > \sum_{r=0}^{\ell} N_{i_r} - M \ge (2\alpha - 1)M$).
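Thus, the cost of $S_2$ is at most
\[
\frac{F^S}{\delta\alpha} + \sum_{i \in \mathcal{F}^c} C_i^S + \sum_{i \in \mathcal{F}^R} C_i^S\Bigl(1 + \frac{1}{2\alpha-1}\Bigr) + \sum_{i \in \mathcal{F}^G} C_i^S \cdot \max\Bigl\{1 + \frac{1}{\alpha},\ 2 + \frac{1-\alpha}{\alpha} + \frac{1}{2\alpha-1}\Bigr\}
\le \frac{F^S}{\delta\alpha} + C^S\Bigl(\frac{1}{\alpha} + \frac{2\alpha}{2\alpha-1}\Bigr).
\]
So if $S$ is the solution given by part (ii) of Theorem 4.2, the cost of $S_2$ is at most
$\bigl(\frac{1}{\alpha} + \frac{2}{\delta\alpha} + (1 + \delta)\bigl(\frac{1}{\alpha} + \frac{2\alpha}{2\alpha-1}\bigr)\bigr) C^*_{\mathcal{I}_2}$, and plugging in the value of $\delta$ yields the
$g(\alpha) = \frac{2}{\alpha} + \frac{2\alpha}{2\alpha-1} + 2\sqrt{\frac{2}{\alpha^2} + \frac{4}{2\alpha-1}}$ approximation
bound stated in Theorem 3.5. ■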
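The last step of the proof relies on a small piece of algebra: the coefficient charged to a supply point in the second case of phase A3, $2 + \frac{1-\alpha}{\alpha} + \frac{1}{2\alpha-1}$, equals $\frac{1}{\alpha} + \frac{2\alpha}{2\alpha-1}$ and dominates the coefficients of the other cases. The following sketch checks this with exact rational arithmetic on a grid of values $\alpha \in (1/2, 1]$. The coefficient expressions are our reading of the proof, and the helper names are hypothetical.

```python
from fractions import Fraction


def case2_coeff(a):
    # 2 C_i for filling up i_{l+1..t}, plus the cost of moving the residual clients
    return 2 + (1 - a) / a + 1 / (2 * a - 1)


def h(a):
    # coefficient of C^S in the final bound
    return 1 / a + 2 * a / (2 * a - 1)


alphas = [Fraction(k, 20) for k in range(11, 21)]        # 11/20, ..., 20/20
identity_holds = all(case2_coeff(a) == h(a) for a in alphas)
dominates = all(
    h(a) >= 1                          # capacitated supply points (phase A1)
    and h(a) >= 1 + 1 / (2 * a - 1)    # supply points handled in phase A2
    and h(a) >= 1 + 1 / a              # first case of phase A3
    for a in alphas
)
```

This is only a sanity check on sample points; the identity itself follows from $2 + \frac{1-\alpha}{\alpha} = 1 + \frac{1}{\alpha}$ and $1 + \frac{1}{2\alpha-1} = \frac{2\alpha}{2\alpha-1}$.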



4.2 A local-search based approximation algorithm for CDUFL 

We now describe our local-search algorithm for CDUFL, which leads to the proof of Theorem 4.2. Let
$\mathcal{F} = \mathcal{F}^u \cup \mathcal{F}^c$ be the facility-set of the CDUFL instance, where $\mathcal{F}^u \cap \mathcal{F}^c = \emptyset$. Here, $\mathcal{F}^u$ are the uncapacitated
facilities with opening costs $\{f_i\}$, and facilities in $\mathcal{F}^c$ have (finite) capacities $\{u_i\}$ and zero opening costs.
Let $\mathcal{D}$ be the set of clients and $c_{ij}$ be the cost of assigning client $j$ to facility $i$. The goal is to open facilities
and assign clients to open facilities (respecting the capacities) so as to minimize the sum of the
facility-opening and client-assignment costs. We can find the best assignment of clients to open facilities by solving
a network flow problem, so we focus on determining the set of facilities to open.

The local-search algorithm consists of three moves: add($i'$), delete($i$), swap($i$, $i'$), which, respectively,
add a facility $i'$ not currently open, delete a facility $i$ that is currently open, and swap facility $i$ that is open
with facility $i'$ that is not open. We note that all previous (local-search) algorithms for CFL that work with
non-uniform capacities use moves that are more complicated than the moves above (and involve adding
and/or deleting multiple facilities at a time). The algorithm repeatedly executes the best cost-improving
move (if one exists) until no such move exists. (As mentioned earlier, to ensure polynomial time, we only
consider moves that yield significant improvement and hence terminate at an approximate local optimum;
but this has only a marginal effect on the approximation bound.) We assume for simplicity that each client
has unit demand. This is without loss of generality because, even with non-unit client-demands, one can
compute the best local-search move (and hence run the algorithm), and for the purposes of analysis, one can
always treat a client with integer demand $d$ as $d$ co-located unit-demand clients.
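The following is a minimal sketch of the add/delete/swap local search on a tiny instance, written under several simplifying assumptions of ours: capacitated facilities are kept open throughout (they cost nothing), moves range over uncapacitated facilities only, the search starts with every uncapacitated facility open, and it stops at an exact local optimum rather than using the "significant improvement" rule needed for polynomial running time. The optimal assignment for a given open set is computed by a simple successive-augmenting-path routine, standing in for a general network-flow solver. All function names and the sample data are hypothetical.

```python
INF = float("inf")


def min_cost_assignment(clients, caps, c):
    """Cheapest assignment of unit-demand clients to facilities with capacities
    caps[i] (INF if uncapacitated), via successive cheapest augmenting paths."""
    assign, load, total = {}, {i: 0 for i in caps}, 0
    for j0 in clients:
        # dist[i]: cheapest net cost of inserting j0 so that facility i gains a client
        dist = {i: c[i][j0] for i in caps}
        pred = {i: None for i in caps}
        for _ in range(len(caps)):                  # Bellman-Ford rounds
            for j, i in assign.items():             # try moving client j from i to k
                for k in caps:
                    step = dist[i] + c[k][j] - c[i][j]
                    if k != i and step < dist[k]:
                        dist[k], pred[k] = step, (i, j)
        free = [i for i in caps if load[i] < caps[i]]
        if not free:
            return INF                              # capacities cannot absorb all clients
        end = min(free, key=lambda i: dist[i])
        total += dist[end]
        load[end] += 1
        while pred[end] is not None:                # re-route clients along the path
            i, j = pred[end]
            assign[j] = end
            end = i
        assign[j0] = end
    return total


def total_cost(open_u, F_c, f, u, clients, c):
    caps = {i: INF for i in open_u}
    caps.update({i: u[i] for i in F_c})             # capacitated facilities stay open
    return sum(f[i] for i in open_u) + min_cost_assignment(clients, caps, c)


def local_search(F_u, F_c, f, u, clients, c):
    current = frozenset(F_u)                        # assumed feasible starting point
    cost = lambda O: total_cost(O, F_c, f, u, clients, c)
    while True:
        closed = [i for i in F_u if i not in current]
        moves = [current | {k} for k in closed]                             # add(i')
        moves += [current - {i} for i in current]                           # delete(i)
        moves += [(current - {i}) | {k} for i in current for k in closed]   # swap(i, i')
        best = min(moves, key=cost, default=current)
        if cost(best) >= cost(current):
            return current, cost(current)           # no improving move: local optimum
        current = best


# Toy instance: two uncapacitated facilities and one capacitated facility (capacity 2).
clients = ["j1", "j2", "j3"]
F_u, F_c = ["u1", "u2"], ["k"]
f, u = {"u1": 5, "u2": 1}, {"k": 2}
c = {"u1": {j: 1 for j in clients}, "u2": {j: 4 for j in clients}, "k": {j: 0 for j in clients}}
solution = local_search(F_u, F_c, f, u, clients, c)
```

On this instance the search deletes `u1` in its first step and then stops, since closing `u2` would leave only two units of capacity for three clients.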



Analysis. Let $S$ denote a local-optimum returned by the algorithm, with facility-opening cost (and set of
open facilities) $F$ and assignment cost $C$. Let sol be an arbitrary CDUFL solution, with facility-cost (and
set of open facilities) $F^{\mathrm{sol}}$ and assignment cost $C^{\mathrm{sol}}$. Note that we may assume that $\mathcal{F}^c \subseteq F \cap F^{\mathrm{sol}}$. For a
facility $i$, we use $\mathcal{D}_S(i)$ and $\mathcal{D}_{\mathrm{sol}}(i)$ to denote respectively the (possibly empty) set of clients served by $i$ in
$S$ and sol. For a client $j$, let $C_j$ and $C_j^{\mathrm{sol}}$ be the assignment cost of $j$ in $S$ and sol respectively.

We borrow ideas from the analysis of the corresponding local-search algorithm for UFL in [1], but the
presence of capacitated facilities means that we need to reassign clients more carefully to analyze the change
in assignment cost due to a local-search move. In particular, unlike the analysis in [1], where upon deletion
of a facility $s \in F$ we reassign only the clients currently assigned to $s$, in our case (as in the analysis of
local-search algorithms for CFL), we need to perform a more "global" reassignment (i.e., even clients not
assigned to $s$ may get reassigned) along certain (possibly long) paths in a suitable graph. This also means
that we need to construct a suitable mapping between paths instead of the client-mapping considered in [1].

We construct a directed graph $G$ with node-set $\mathcal{D} \cup \mathcal{F}$, and arcs from $i$ to all clients in $\mathcal{D}_S(i)$ and arcs
from all clients in $\mathcal{D}_{\mathrm{sol}}(i)$ to $i$, for every facility $i$. Via standard flow-decomposition, we can decompose $G$
into a collection of (simple) paths $\mathcal{P}$, and cycles $\mathcal{R}$, so that (i) each facility $i$ appears as the starting point of
$\max\{0, |\mathcal{D}_S(i)| - |\mathcal{D}_{\mathrm{sol}}(i)|\}$ paths, and the ending point of $\max\{0, |\mathcal{D}_{\mathrm{sol}}(i)| - |\mathcal{D}_S(i)|\}$ paths, and (ii) each
client $j$ appears on a unique path $P_j$ or on a cycle. Let $\mathcal{P}^{\mathrm{st}}(s) \subseteq \mathcal{P}$ and $\mathcal{P}^{\mathrm{end}}(o) \subseteq \mathcal{P}$ denote respectively
the collection of paths starting at $s$ and ending at $o$, and $\mathcal{P}(s, o) = \mathcal{P}^{\mathrm{st}}(s) \cap \mathcal{P}^{\mathrm{end}}(o)$. For a path $P =
\{i_0, j_0, i_1, j_1, \ldots, i_k, j_k, i_{k+1} := o\} \in \mathcal{P}$, define $\mathcal{D}(P) = \{j_0, \ldots, j_k\}$, $\mathit{head}(P) = j_0$, and $\mathit{tail}(P) = j_k$.
A shift along $P$ means that we reassign client $j_r$ to $i_{r+1}$ for each $r = 0, \ldots, k$ (opening $o$ if necessary).
Note that this is feasible, since if $o \in \mathcal{F}^c$, we know that $|\mathcal{D}_S(o)| \le |\mathcal{D}_{\mathrm{sol}}(o)| - 1 \le u_o - 1$. Let $\mathit{shift}(P) :=
\sum_{j \in \mathcal{D}(P)} (C_j^{\mathrm{sol}} - C_j)$ be the increase in assignment cost due to this reassignment, which is an upper bound
on the actual increase in assignment cost if $o$ is added to $F$. Also, let $\mathit{cost}(P) := \sum_{j \in \mathcal{D}(P)} (C_j^{\mathrm{sol}} + C_j)$. We
define a shift along a cycle $R \in \mathcal{R}$ similarly, letting $\mathit{shift}(R) := \sum_{j \in \mathcal{D} \cap R} (C_j^{\mathrm{sol}} - C_j)$. By considering a
shift operation for every path and cycle in $\mathcal{P} \cup \mathcal{R}$ (i.e., suitable add moves), we get the following result.
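As a small worked example of these two quantities, the sketch below evaluates the shift and cost of a path from the per-client assignment costs in $S$ and in sol. The function names and the numbers are ours and purely illustrative.

```python
def shift(path_clients, C, C_sol):
    """Increase in assignment cost when every client on the path moves to the
    facility that serves it in sol."""
    return sum(C_sol[j] - C[j] for j in path_clients)


def cost(path_clients, C, C_sol):
    """Total length of the path: each client contributes both of its assignment costs."""
    return sum(C_sol[j] + C[j] for j in path_clients)


# Hypothetical path with two clients.
C = {"j0": 3, "j1": 5}          # assignment costs in the local optimum S
C_sol = {"j0": 2, "j1": 1}      # assignment costs in the reference solution sol
P = ["j0", "j1"]
shift_P, cost_P = shift(P, C, C_sol), cost(P, C, C_sol)
```

Note that the two quantities always add up to twice the sol-cost of the clients on the path, an identity used later when the path contributions are summed.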



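Lemma 4.4 For every $o \in F^{\mathrm{sol}}$ and any $Q \subseteq \mathcal{P}^{\mathrm{end}}(o)$, we have
\[
\sum_{P \in Q} \mathit{shift}(P) \ge \begin{cases} -f_o & \text{if } o \in F^{\mathrm{sol}} \setminus F, \\ 0 & \text{otherwise.} \end{cases}
\]
For every cycle $R \in \mathcal{R}$, we have $\mathit{shift}(R) \ge 0$. Thus, we have $C \le F^{\mathrm{sol}} + C^{\mathrm{sol}}$.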


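Bounding the opening cost of facilities in $F \setminus F^{\mathrm{sol}}$. For this, we only need paths that start at a facility in
$F \setminus F^{\mathrm{sol}}$. Note that all facilities in $(F \setminus F^{\mathrm{sol}}) \cup (F^{\mathrm{sol}} \setminus F)$ are uncapacitated. To avoid excessive notation, for
a facility $o \in F^{\mathrm{sol}} \setminus F$, we now use $\mathcal{P}^{\mathrm{end}}(o)$ to refer to the collection of paths ending in $o$ that start in $F \setminus F^{\mathrm{sol}}$.
(As before, $\mathcal{P}(s, o)$ is the set of paths that start at $s$ and end at $o$.) For any $o \in F^{\mathrm{sol}} \setminus F$, we can obtain a
1-1 mapping $\pi : \mathcal{P}^{\mathrm{end}}(o) \to \mathcal{P}^{\mathrm{end}}(o)$ such that if $P \in \mathcal{P}(s, o)$, $s \in F \setminus F^{\mathrm{sol}}$ and $\pi(P) = P' \in \mathcal{P}(s', o)$,
then (i) if $|\mathcal{P}(s, o)| \le \frac{1}{2}|\mathcal{P}^{\mathrm{end}}(o)|$, we have $s \ne s'$; (ii) if $s = s'$, then $P = P'$; and (iii) $\pi(P') = P$. Say that
$o \in F^{\mathrm{sol}} \setminus F$ is captured by $s$ if $|\mathcal{P}(s, o)| > \frac{1}{2}|\mathcal{P}^{\mathrm{end}}(o)|$. Note that $o$ is captured by at most one facility in $F$.
Call a facility in $F \setminus F^{\mathrm{sol}}$ good if it does not capture any facility, and bad otherwise.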

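Lemma 4.5 For any good facility $s$, we have
\[
f_s \le \sum_{P \in \mathcal{P}^{\mathrm{st}}(s)} \mathit{shift}(P) + \sum_{o \notin F,\ P \in \mathcal{P}(s,o)} \mathit{cost}(\pi(P)). \qquad (1)
\]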
Proof : Consider the move delete($s$). We upper bound the increase in reassignment cost as follows.
Consider $j \in \mathcal{D}_S(s)$, and let $P_j \in \mathcal{P}(s, o)$. (Recall that $P_j$ is the unique path containing $j$.) If $o \in F \cap F^{\mathrm{sol}}$,
then we perform a shift along $P_j$. Otherwise, let $\pi(P_j) \in \mathcal{P}(s', o)$, where $s' \ne s$. We reassign all clients
on $P_j$ except $\mathit{tail}(P_j)$ as in the shift operation, and reassign $\mathit{tail}(P_j)$ to $s'$. Let $k = \mathit{tail}(P_j)$. Since
$c_{s'k} \le c_{s'o} + C_k^{\mathrm{sol}} \le \mathit{cost}(\pi(P_j)) + C_k^{\mathrm{sol}}$, the increase in cost by reassigning clients on $P_j$ this way is at
most $\mathit{cost}(\pi(P_j)) + C_k^{\mathrm{sol}} - C_k + \sum_{j' \in \mathcal{D}(P_j) \setminus \{k\}} (C_{j'}^{\mathrm{sol}} - C_{j'})$. Thus, the actual increase in cost due to this
move, which should be nonnegative, is at most
\[
-f_s + \sum_{o \in F,\ P \in \mathcal{P}(s,o)} \mathit{shift}(P) + \sum_{o \notin F,\ P \in \mathcal{P}(s,o)} \bigl[\mathit{shift}(P) + \mathit{cost}(\pi(P))\bigr]. \qquad ■
\]

Now consider a bad facility $s$. Let $\mathit{capt}_s \subseteq F^{\mathrm{sol}} \setminus F$ be the facilities captured by $s$, and let $o_s \in \mathit{capt}_s$
be the facility nearest to $s$.

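Lemma 4.6 For any bad facility $s$, we have
\[
f_s \le \sum_{o \in \mathit{capt}_s} f_o + \sum_{P \in \mathcal{P}^{\mathrm{st}}(s)} \mathit{shift}(P)
+ \sum_{\substack{o \notin F \\ P \in \mathcal{P}(s,o):\ \pi(P) \ne P}} \mathit{cost}(\pi(P))
+ \sum_{\substack{o \in \mathit{capt}_s \setminus \{o_s\} \\ P \in \mathcal{P}(s,o):\ \pi(P) = P}} \mathit{cost}(P). \qquad (2)
\]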

Proof : Consider the move swap($s$, $o_s$). We reassign client $j \in \mathcal{D}_S(s)$ as follows. Let $P_j \in \mathcal{P}(s, o)$.

• If $o \in F \cap F^{\mathrm{sol}}$, or $o = o_s$ and $\pi(P_j) = P_j$, we perform a shift along $P_j$. The increase in assignment
cost is at most $\mathit{shift}(P_j)$.

Otherwise, let $\pi(P_j) \in \mathcal{P}(s', o)$.

• If $\pi(P_j) \ne P_j$ (so $s' \ne s$), we reassign $\mathcal{D}(P_j) \setminus \{\mathit{tail}(P_j)\}$ as in the shift operation, and assign $\mathit{tail}(P_j)$
to $s'$. As in the proof of Lemma 4.5, the increase in assignment cost is at most $\mathit{shift}(P_j) + \mathit{cost}(\pi(P_j))$.

• If $\pi(P_j) = P_j$ (so $o \ne o_s$), we assign $j$ to $o_s$. Note that $c_{o_s j} \le C_j + c_{s o_s} \le C_j + c_{so} \le C_j + \mathit{cost}(P_j)$,
so the increase in assignment cost is at most $\mathit{cost}(P_j)$.

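This gives the inequality
\[
0 \le f_{o_s} - f_s
+ \sum_{\substack{P \in \mathcal{P}(s,o):\ o \in F \text{ or} \\ o = o_s,\ \pi(P) = P}} \mathit{shift}(P)
+ \sum_{o \notin F}\ \sum_{P \in \mathcal{P}(s,o):\ \pi(P) \ne P} \bigl[\mathit{shift}(P) + \mathit{cost}(\pi(P))\bigr]
+ \sum_{o \notin F:\ o \ne o_s}\ \sum_{P \in \mathcal{P}(s,o):\ \pi(P) = P} \mathit{cost}(P). \qquad (3)
\]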

Now consider the operation add($o$) for all $o \in \mathit{capt}_s \setminus \{o_s\}$, and apply Lemma 4.4 taking $Q = \{P \in
\mathcal{P}(s, o) : \pi(P) = P\}$. This yields the inequality $0 \le f_o + \sum_{P \in \mathcal{P}(s,o):\ \pi(P) = P} \mathit{shift}(P)$ for each $o \in
\mathit{capt}_s \setminus \{o_s\}$. Adding these inequalities to (3), and rearranging proves the lemma. ■

Proof of Theorem 4.2: We prove part (i); part (ii) follows directly from part (i) and Lemma 4.1. Lemma 4.4
bounds $C$. Consider adding (1) for all good facilities and (2) for all bad facilities, and the vacuous equality
$f_i = f_i$ for all $i \in F \cap F^{\mathrm{sol}}$. The LHS of the resulting inequality is precisely $F$. The $f_i$'s on the RHS add up to
give at most $F^{\mathrm{sol}}$. We claim that each path $P \in \bigcup_{s \in F \setminus F^{\mathrm{sol}}} \mathcal{P}^{\mathrm{st}}(s)$ contributes at most $\mathit{shift}(P) + \mathit{cost}(P) =
2\sum_{j \in \mathcal{D}(P)} C_j^{\mathrm{sol}}$ to the RHS. Thus the RHS is at most $F^{\mathrm{sol}} + 2C^{\mathrm{sol}}$, and we obtain that $F \le F^{\mathrm{sol}} + 2C^{\mathrm{sol}}$.
Each path $P$ in $\bigcup_{s \notin F^{\mathrm{sol}},\ o \in F} \mathcal{P}(s, o)$ appears exactly once, either in (1) or in (2), and contributes $\mathit{shift}(P)$.
Now consider a path $P \in \bigcup_{s \notin F^{\mathrm{sol}},\ o \notin F} \mathcal{P}(s, o)$, and let $\pi(P) = P' \in \mathcal{P}(s', o)$. Note that $\pi(P') = P$. If
$P' \ne P$, then $P$ appears twice in our inequality-system: once in the inequality for $s$ contributing $\mathit{shift}(P)$
(due to $P$), and once in the inequality for $s'$ contributing $\mathit{cost}(P)$ (due to $P'$). If $P' = P$, then $s = s'$
and $s$ is a bad facility; now $P$ appears only in (2) (for $s$) and contributes either $\mathit{shift}(P)$ if $o = o_s$, or
$\mathit{shift}(P) + \mathit{cost}(P)$ otherwise. ■

Corollary of Theorem 4.2. There is a $(1 + \sqrt{2})$-approximation algorithm for CDUFL.

Proof : We take sol in part (i) of Theorem 4.2 to be an optimum solution (with cost $F^{\mathrm{opt}} + C^{\mathrm{opt}}$) to the
instance, and scale the facility costs by a factor $\gamma$ before running the local-search algorithm. The solution returned
has cost $F + C \le \bigl(F^{\mathrm{opt}} + \frac{2}{\gamma} \cdot C^{\mathrm{opt}}\bigr) + \bigl(\gamma F^{\mathrm{opt}} + C^{\mathrm{opt}}\bigr)$. Setting $\gamma = \sqrt{2}$ yields the result. ■
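The choice of the scaling factor can be checked with a line of arithmetic. Writing $\gamma$ for the factor by which facility costs are scaled (the symbol is our choice), the bound on $F + C$ has coefficient $1 + \gamma$ on $F^{\mathrm{opt}}$ and $1 + 2/\gamma$ on $C^{\mathrm{opt}}$, and the two coincide at $\gamma = \sqrt{2}$. A minimal sketch, with a hypothetical helper name:

```python
import math


def combined_bound(gamma):
    """Approximation factor implied by F <= F_opt + (2/gamma) C_opt and
    C <= gamma F_opt + C_opt: the larger of the two coefficients."""
    return max(1 + gamma, 1 + 2 / gamma)


best = combined_bound(math.sqrt(2))
```

Any other positive value of the factor makes one of the two coefficients larger; for instance, both `combined_bound(1)` and `combined_bound(2)` evaluate to 3.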
A Integrality-gap example for the natural LP-relaxation for CDUFL

Let $(\mathcal{F} = \mathcal{F}^u \cup \mathcal{F}^c, \mathcal{D}, \{f_i\}, \{u_i\}, \{c_{ij}\})$ be a CDUFL instance with facility-set $\mathcal{F}$ (where $u_i = \infty$ for all
$i \in \mathcal{F}^u$, and $f_i = 0$ for all $i \in \mathcal{F}^c$), and client-set $\mathcal{D}$. We consider the following LP-relaxation. We use $i$ to
index facilities, and $j$ to index clients. Note that we may assume that all facilities in $\mathcal{F}^c$ are open.

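\[
\begin{array}{lll}
\min & \sum_{i \in \mathcal{F}^u} f_i y_i + \sum_{i,j} c_{ij} x_{ij} & \text{(LP)} \\
\text{s.t.} & \sum_i x_{ij} \ge 1 & \text{for all } j \\
 & x_{ij} \le y_i & \text{for all } i \in \mathcal{F}^u,\ j \\
 & \sum_j x_{ij} \le u_i & \text{for all } i \in \mathcal{F}^c \\
 & x_{ij},\ y_i \ge 0 & \text{for all } i, j.
\end{array}
\]

Here $y_i$ denotes if facility $i$ is open, and $x_{ij}$ denotes if client $j$ is assigned to facility $i$. (We assume that each
client has unit demand.)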

Now consider the following simple CDUFL instance. We have two facilities $i$ and $i'$, and $u + 1$ clients, all
present at the same location. Facility $i$ is uncapacitated and has opening cost $f$, and facility $i'$ has capacity
$u$ (and zero opening cost). Any solution to CDUFL must open facility $i$ and therefore incur cost at least $f$.
However, there is a feasible solution to (LP) of cost $\frac{f}{u+1}$: we set $y_i = \frac{1}{u+1}$, and $x_{ij} = \frac{1}{u+1}$, $x_{i'j} = \frac{u}{u+1}$
for all $j$.
Thus, the integrality gap of (LP) is at least $u + 1$.
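The fractional solution in this example is easy to verify mechanically. The sketch below builds it with exact rationals, checks the three families of constraints, and compares its value with the integral lower bound $f$. The function name and the sample values $f = 10$, $u = 4$ are ours.

```python
from fractions import Fraction


def fractional_solution(f, u):
    """Fractional LP solution for the two-facility instance with u + 1 co-located
    unit-demand clients. Returns (feasible, lp_cost)."""
    n = u + 1
    y_i = Fraction(1, n)                      # uncapacitated facility i, fractionally open
    x_i = [Fraction(1, n)] * n                # share of each client sent to i
    x_ip = [Fraction(u, n)] * n               # share of each client sent to i'
    feasible = (
        all(a + b >= 1 for a, b in zip(x_i, x_ip))    # every client fully served
        and all(a <= y_i for a in x_i)                # x_ij <= y_i
        and sum(x_ip) <= u                            # capacity of i'
    )
    lp_cost = f * y_i                         # all connection costs are zero here
    return feasible, lp_cost


feasible, lp_cost = fractional_solution(f=10, u=4)
gap = Fraction(10) / lp_cost                  # integral cost is at least f = 10
```

For these sample values the fractional cost is 2, so the ratio to the integral lower bound is $u + 1 = 5$.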



B The locality gap of a local-search algorithm for LBFL 

We show that the local-search algorithm based on add, delete, and swap moves (that is, adding/dropping
one facility, with add permitted only if it preserves feasibility, or deleting one facility and adding another)
has a bad locality gap, which is the maximum ratio between the cost of a locally-optimal solution and
that of a (globally) optimal solution. Consider the LBFL instance shown below with facility-set $\mathcal{F} =
\{o, s_1, s_2, \ldots, s_M\}$, and client-set $\mathcal{D} = \mathcal{D}_1 \cup \mathcal{D}_2 \cup \ldots \cup \mathcal{D}_M$, where the $\mathcal{D}_i$'s are disjoint sets of size $M$.
The facility-opening costs are as follows: $f_o = M^2 + \epsilon$ and $f_{s_i} = M$ for each $i \in \{1, 2, \ldots, M\}$. For
each $i = 1, \ldots, M$ and each client $j \in \mathcal{D}_i$, we have $c_{oj} = 1$, $c_{s_i j} = M$. All other distances are defined by
taking the metric completion with respect to these $c_{ij}$'s. One can verify that the solution $S$ which opens the
facilities $\{s_1, s_2, \ldots, s_M\}$ is a local optimum. The cost of this solution is $M^2 + M^3$. However, the optimal
solution opens facility $\{o\}$, and incurs a total cost of $2M^2 + \epsilon$. Thus, the locality gap is at least $M/2$.




We can modify this example to show that the locality gap remains bad, even if we aim for a bicriteria
solution and consider an add move to be permissible if every open facility can be assigned at least $\alpha M$
clients. The only change is that each set $\mathcal{D}_i$ now has $\alpha M$ clients: $S$ is still a local optimum, and the locality
gap is therefore at least $\alpha M/2$.
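The cost figures in this example, and in its bicriteria variant, can be reproduced directly. The sketch below only evaluates the two solutions' costs (it does not verify local optimality); the function name, the parameter `alpha`, and the sample values are our own.

```python
from fractions import Fraction


def example_costs(M, eps, alpha=1):
    """Costs of the local optimum S (open s_1..s_M) and of the solution opening
    only o, when each set D_i has alpha * M clients."""
    clients_per_set = alpha * M
    local = M * M + M * clients_per_set * M          # M facilities of cost M; each client pays M
    opt = (M * M + eps) + M * clients_per_set * 1    # facility o; each client pays 1
    return local, opt


M, eps = 10, Fraction(1, 100)
local, opt = example_costs(M, eps)                             # original instance
local_bi, opt_bi = example_costs(M, eps, alpha=Fraction(1, 2))  # bicriteria variant
```

With $M = 10$ the ratios are roughly 5.5 and 4.0 respectively, comfortably above the stated lower bounds $M/2$ and $\alpha M/2$.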



Bad example with zero facility-opening costs. Even in the setting where all facilities have zero opening
cost (as in the $\mathcal{I}_2$ instance), we can construct bad examples for local-search based on add, delete,
and swap moves. For simplicity, first suppose that $M = 2$. Consider a cycle with $4k$ nodes, which are
labeled $o_0, j_0, s_0, j_1, o_1, j_2, s_1, j_3, \ldots, o_r, j_{2r}, s_r, j_{2r+1}, \ldots, o_{k-1}, j_{2k-2}, s_{k-1}, j_{2k-1}, o_0$. We have $2k$
facilities $\mathcal{F} = \{o_0, \ldots, o_{k-1}, s_0, \ldots, s_{k-1}\}$, and $2k$ clients $\mathcal{D} = \{j_0, j_1, \ldots, j_{2k-1}\}$ (see the accompanying figure). We define the
following distances.

• $c_{o_i j_{2i \bmod 2k}} = c_{o_i j_{(2i-1) \bmod 2k}} = 1$ for all $i = 0, \ldots, k - 1$.

• $c_{s_i j_{2i}} = c_{s_i j_{2i+1}} = k - \epsilon$ for all $i = 0, \ldots, k - 1$.

All other distances are defined by taking the metric completion with respect to these $c_{ij}$'s.

The solution $S$ which opens facilities $\{s_0, s_1, \ldots, s_{k-1}\}$ is a local optimum: no add move is feasible,
and it is easy to see that no delete move improves the cost. Consider a swap move, which we may assume
is of the form swap($s_r$, $o_0$) by symmetry. The new client-assignment will not necessarily assign the clients
$j_{2r}$ and $j_{2r+1}$ (which were previously assigned to $s_r$) to $o_0$. However, the intuition is that the long cycle will
lead to a large increase in assignment cost. The optimal way of reassigning clients is to assign $j_{2k-1}, j_0$ to
$o_0$, assign $j_{2i+1}, j_{2i+2}$ to $s_i$ for $i \in \{0, \ldots, r - 1\}$ (which is empty if $r = 0$), and assign $j_{2i}, j_{2i-1}$ to $s_i$
for $i \in \{r + 1, \ldots, k - 1\}$ (which is empty if $r = k - 1$). The cost increase due to this reassignment is
$2(1 - k + \epsilon) + (k - 1) \cdot 2 = 2\epsilon > 0$. Thus, $S$ is a local optimum.

The cost of $S$ is $2k(k - \epsilon)$. However, the optimal solution opens facilities $\{o_0, \ldots, o_{k-1}\}$, and has a
total cost of $2k$. So this instance shows a locality gap of $k - \epsilon$, and since $k$ can be made arbitrarily large, this
shows an unbounded locality gap.
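The arithmetic of the cycle example can be replayed in code. The sketch below lays the $4k$ nodes out on a cycle, uses shortest-path distance around the cycle as the metric completion, and evaluates the cost of $S$, the cost of the solution opening all $o_i$, and the cost change of the reassignment described for swap($s_r$, $o_0$). It takes that reassignment as given rather than searching for the best one, and all names and sample values are ours.

```python
from fractions import Fraction


def cycle_instance(k, eps):
    """Returns dist(a, b), the shortest-path metric on the 4k-cycle
    o_0, j_0, s_0, j_1, o_1, j_2, s_1, j_3, ..., back to o_0."""
    pos, acc = {}, 0
    for r in range(k):
        # each entry: (node, length of the edge leaving it clockwise)
        for node, w in [(("o", r), 1), (("j", 2 * r), k - eps),
                        (("s", r), k - eps), (("j", 2 * r + 1), 1)]:
            pos[node] = acc
            acc += w
    total = acc

    def dist(a, b):
        d = abs(pos[a] - pos[b])
        return min(d, total - d)

    return dist


def assignment_cost(dist, assign):
    return sum(dist(fac, ("j", j)) for j, fac in assign.items())


def solution_costs(k, eps):
    dist = cycle_instance(k, eps)
    local = {j: ("s", j // 2) for j in range(2 * k)}             # solution S
    opt = {j: ("o", ((j + 1) // 2) % k) for j in range(2 * k)}   # open all o_i
    return assignment_cost(dist, local), assignment_cost(dist, opt)


def swap_increase(k, eps, r):
    """Cost change of the reassignment described for swap(s_r, o_0)."""
    dist = cycle_instance(k, eps)
    old = {j: ("s", j // 2) for j in range(2 * k)}
    new = {2 * k - 1: ("o", 0), 0: ("o", 0)}
    for i in range(r):
        new[2 * i + 1] = new[2 * i + 2] = ("s", i)
    for i in range(r + 1, k):
        new[2 * i] = new[2 * i - 1] = ("s", i)
    return assignment_cost(dist, new) - assignment_cost(dist, old)


k, eps = 4, Fraction(1, 10)
cost_S, cost_opt = solution_costs(k, eps)
increases = [swap_increase(k, eps, r) for r in range(k)]
```

For every choice of $r$ the increase is $2\epsilon$: the two clients moved to $o_0$ save $2(k - 1 - \epsilon)$ in total, while $k - 1$ other clients each pay 2 more.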

The above example can be extended to all values of $M$. For each $M$, let $G^M$ be an $M$-regular bipartite
graph with vertex set $V = \{o_1, o_2, \ldots, o_\ell\} \cup \{s_1, s_2, \ldots, s_\ell\}$ with a large girth $T$. We use $G^M$ to construct
the following LBFL instance. The set of facilities is $\{o_1, \ldots, o_\ell, s_1, \ldots, s_\ell\}$. For each edge $(s_n, o_m)$ in $G^M$,
we create a client $j_{nm}$ with $c_{s_n j_{nm}} = T - \epsilon$ and $c_{o_m j_{nm}} = 1$. As before, one can argue that the solution $S$
that opens facilities $\{s_1, s_2, \ldots, s_\ell\}$ is a local optimum. The cost of this solution is $\ell M(T - \epsilon)$, whereas the
solution that opens facilities $\{o_1, \ldots, o_\ell\}$ has total cost of $\ell M$. So the locality gap is $T - \epsilon$, which can be made
arbitrarily large.

Figure 3: Bad locality-gap example with facility costs